
Journal of Clinical and Translational Science

Cambridge University Press (CUP)

Preprints posted in the last 7 days, ranked by how well they match Journal of Clinical and Translational Science's content profile, based on 11 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.

1
Expanding Faculty Representation in US Academic Neurological Surgery: Achievements and On-going Challenges.

Shireman, J.; Mukherjee, N.; Brackman, K.; Kurtz, N.; Patniak, A.; McCarthy, L.; Gonugunta, N.; Ammanuel, S.; Dey, M.

2026-04-27 medical education 10.64898/2026.04.24.26351672 medRxiv
Top 0.1%
3.9%

Objectives: Academic medical institutions are the gatekeepers of the physician workforce and shape the future of medicine by regulating medical school admissions as well as residency training. Although the field of medicine broadly is seeing more representation from traditionally underrepresented groups, the critical decision-making platform of academic medicine remains uncharacteristically homogeneous, represented mainly by white males. This is even more pronounced in surgical subspecialties such as academic neurosurgery. This study aims to quantify this phenomenon, uncover its driving factors, and define opportunities for improvement. Methods: Using a mixed research methodology, academic neurosurgical faculty in the U.S. were identified, and their demographic data were collected. An internet search using Google Scholar and Scopus was conducted to determine scholarly activity using number of publications and h-index. Results: We found a significant increase in female faculty in academic neurosurgery within the last decade. Comparing faculty rank between male and female faculty, we found that the majority of female faculty are at the assistant professor level (n=36/79; 45.6%), while male faculty are concentrated at the full professor rank (n=265/582; 45.5%). A similar trend was seen for under-represented minority neurosurgery faculty. Strong scholarly activity correlated with a departmental chair position for male faculty; however, this trend did not hold for female faculty. There was a significant difference in the number of publications and h-index between female and male faculty, but only when including male faculty outliers at the full professor level. Conclusion: Slowly but steadily, academic neurosurgery is making progress towards a more diverse and representative workforce in the U.S. that better reflects the patient population. Facilitating timely progression of female and URM neurosurgeons into senior professorship and academic leadership roles will further advance this essential progress.

2
The Evolution and Equity of China's Pharmacist Workforce in Healthcare Institutions: A Provincial Panel Data Analysis, 2007-2023

Xia, Y.; Sun, L.; Zhao, Y.

2026-04-23 health policy 10.64898/2026.04.22.26351514 medRxiv
Top 0.2%
1.8%

Background: China has implemented policies to strengthen its pharmacist workforce since the 2009 healthcare reform, yet a comprehensive evaluation of their long-term systemic effects is lacking. Objective: To systematically analyze the evolution of China's pharmacist workforce in healthcare institutions from 2007 to 2023 across four dimensions: quantity, quality, structure, and distribution, providing an empirical foundation for policy optimization. Methods: A retrospective analysis was conducted using longitudinal data from the China Health Statistics Yearbooks. Trends were delineated via descriptive statistics. Equity and spatial evolution were assessed using the Gini coefficient, Theil index decomposition, and spatial autocorrelation analyses (Global Moran's I and hotspot analysis). Results: From 2007 to 2023, the total number of pharmacists increased from 357,700 to 569,500 (average annual growth: 2.2%). This growth lagged behind physicians (4.6%) and nurses (7.4%), causing the pharmacist-to-physician ratio to decline from 1:5.15 to 1:8.39. The workforce showed trends of feminization (the female proportion rose from 59.7% to 70.8%) and aging. While quality improved, 51.1% still held an associate degree or below, and only 6.6% held senior titles. Equity analysis revealed the provincial Gini coefficient improved from 0.145 to 0.093. Theil index decomposition confirmed intra-provincial disparities as the primary inequality driver. Spatial analysis showed a non-significant global Moran's I by 2023 (0.154, P>0.05), down from 0.254 (P<0.01) in 2007. Hotspot analysis confirmed this transition, revealing a contraction of high-confidence clusters and a trend toward balanced distribution. Conclusions: China has made measurable progress in expanding pharmacist workforce size and improving inter-provincial equity since 2007. However, persistent structural challenges remain: relative workforce contraction compared to other health professions, an aging demographic, a shortage of senior talent, and significant intra-provincial inequity. Future policies must prioritize optimizing workforce structure and enhancing clinical service capabilities to catalyze a shift toward patient-centered pharmaceutical care.
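The equity analysis above leans on the Gini coefficient, which has a compact computational form based on the mean absolute pairwise difference. A minimal illustrative sketch in Python (not the authors' code; the input values are hypothetical):

```python
import numpy as np

def gini(x):
    """Gini coefficient of a non-negative distribution:
    G = (mean absolute pairwise difference) / (2 * mean).
    0 = perfect equality; (n-1)/n = maximal concentration."""
    x = np.asarray(x, dtype=float)
    mad = np.abs(x[:, None] - x[None, :]).mean()  # mean over all ordered pairs
    return mad / (2 * x.mean())

# Hypothetical per-province pharmacist densities (not the study's data):
print(round(gini([5.0, 5.0, 5.0, 5.0]), 3))  # perfect equality -> 0.0
```

On this scale, a provincial coefficient falling from 0.145 to 0.093, as reported above, is a modest move toward inter-provincial equality.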

3
Cardiac Rehabilitation and Functional Capacity Improvement: Montana Outcomes Project Cardiac Rehabilitation Registry Findings

Claus, L.; McNamara, M.; Oser, C.; Fogle, C.; Canine, B.

2026-04-21 public and global health 10.64898/2026.04.20.26351126 medRxiv
Top 0.3%
1.3%

Cardiovascular disease (CVD) remains the leading cause of mortality in the United States, despite being largely preventable through effective management of risk factors. This study evaluates the impact of Phase II cardiac rehabilitation (CR) on functional capacity and quality of life, using data from the Montana Outcomes Project Cardiac Rehabilitation Registry. Functional capacity improvements were assessed via the six-minute walk test (6MWT) and Dartmouth COOP questionnaire, with statistical analyses exploring the influence of CR session attendance, demographic factors, and referring diagnoses. Results demonstrated significant gains in 6MWT, with a mean improvement of 330.73 feet (p < .0001), and quality of life scores across all subgroups. A dose-response relationship was observed, indicating greater improvements with increased CR sessions (p < .0001), though diminishing returns were observed beyond 24-35 visits. Demographic factors and complex conditions influenced outcomes, underscoring the need for tailored strategies to enhance CR access and effectiveness. These findings highlight the critical role of CR in improving patient outcomes and emphasize the importance of addressing barriers to participation in underserved populations.

4
Demographic Factors Moderate the Effectiveness of Obesity Prevention Interventions: A Secondary Analysis of College Intervention Trials

Winn, C.; Groene, L.; Colby, S.; Ademu, L.; Olfert, M. D.; Byrd-Bredbenner, C.; Mathews, A.; Stabile Morrell, J.; Brenes, P.; Brown, O.; Barr-Porter, M.; Greene, G.; Dhillon, J.

2026-04-27 nutrition 10.64898/2026.04.22.26351238 medRxiv
Top 0.3%
1.2%

Background: College-attending young adults frequently experience declines in diet quality, physical activity, and psychological well-being during the transition to independent living, contributing to weight gain during the first year of college. Although multicomponent lifestyle interventions have been developed to address these behaviors, responsiveness to such programs could differ across demographic factors associated with health behaviors, such as sex, race, and ethnicity. Hence, this secondary analysis of large-scale college health trials evaluated whether the effectiveness of such interventions differed by these demographic factors. Methods: Data were combined from two multi-site randomized controlled trials: the Young Adults Eating and Active for Health (YEAH) trial and the Get FRUVED trial. Both interventions used theory-based approaches to promote healthy weight management through improvements in diet quality, physical activity, and stress management. Baseline-adjusted linear regression models evaluated the effects of group (intervention, control) and its interactions with sex, race (White, Black, Other), or Hispanic ethnicity. Models were adjusted for baseline outcome values, baseline BMI, study (YEAH vs. FRUVED), and state of data collection. Results: Intervention participants reported higher fruit and vegetable intake, lower processed meat intake, and longer sleep duration compared with controls. However, there was significant heterogeneity in these dietary outcomes by ethnicity, race, and sex. Non-Hispanic participants in the intervention group had higher fruit and vegetable intake compared to controls (p < 0.05). Within the intervention group, Hispanic females had lower bacon/sausage intake than Hispanic males and non-Hispanic females (p < 0.05). With respect to race, Black participants reported higher total processed meat intake than White and Other race participants (p < 0.05).
These demographic factors did not moderate the intervention's impact on physical activity, sleep duration, and perceived stress. Overall, the intervention appeared to be the least effective for Hispanic males who exhibited higher body weight and waist circumference compared with Hispanic females and non-Hispanic males (p < 0.05). Conclusions: Multicomponent lifestyle interventions can improve selected dietary outcomes among college students, but effectiveness may differ across demographic subgroups. Culturally and sex-tailored strategies that consider the intersecting influences of sex, race, and ethnicity may enhance intervention effectiveness during the transition to college.

5
Translation, Validation, and Application of Indonesian Genetic Literacy Questionnaires for Medical Students

Kemal, R. A.; Dhani, R.; Simanjuntak, A. M.; Rafles, A. I.; Triani, H. X.; Rahmi, T. M.; Akbar, V. A.; Firdaus, F.; Pratama, B. F.; Zulharman, Z.

2026-04-25 medical education 10.64898/2026.04.17.26350524 medRxiv
Top 0.3%
0.9%

Background: The increasing relevance of genetics and molecular biology in medicine necessitates greater genetic literacy among healthcare workers. To assess literacy levels, a validated genetic literacy questionnaire is needed; therefore, a standardised Indonesian-language genetic literacy questionnaire is essential. Aims: We aimed to translate and validate three genetic literacy questionnaires (PUGGS, iGLAS, and UNC-GKS) for use among Indonesian medical students. We then evaluated genetic literacy levels using one of the validated questionnaires. Methods: The PUGGS, iGLAS, and UNC-GKS questionnaires were translated into Indonesian and then reviewed by an expert panel for translational accuracy and conceptual appropriateness. Back-translation was performed to confirm validity. Initial Indonesian versions of the questionnaires underwent cognitive pre-testing with 12 undergraduate medical students. After refinements, the questionnaires were validated among 34 first- to third-year medical students. The Indonesian version of the UNC-GKS questionnaire was then used to assess the genetic literacy of 486 medical students comprising 228 preclinical medical students, 187 clerkship students, and 71 residents. Results: The Indonesian versions of PUGGS (Cronbach's α = 0.819) and UNC-GKS (α = 0.809) demonstrated good reliability, while iGLAS showed poor reliability (α = 0.315). Among the 486 students tested, 56% demonstrated moderate overall genetic literacy, and only 15.2% demonstrated good overall literacy. Basic genetic concepts were relatively well understood, with 54.3% having good literacy. In contrast, gene variants' effects on health were poorly understood, with only 9.7% having good literacy. Inheritance concepts were moderately understood, with 24.9% having good literacy. Conclusion: The Indonesian translations of PUGGS and UNC-GKS are reliable tools for assessing genetic literacy among medical students. Using UNC-GKS, we observed predominantly moderate genetic literacy levels. Curriculum improvement to better integrate genetics education is essential to support its clinical applications.

6
The Golden Opportunity or the Cutting Room Floor? Quantifying and Characterizing the Loss and Addition of Social Determinants of Health during Clinician Editing of Ambient AI Documentation

Kim, S.; Guo, Y.; Sutari, S.; Chow, E.; Tam, S.; Perret, D.; Pandita, D.; Zheng, K.

2026-04-22 health systems and quality improvement 10.64898/2026.04.20.26351322 medRxiv
Top 0.5%
0.7%

Social determinants of health (SDoH) are important for clinical care, but it remains unclear how much AI-captured social context is preserved after clinician editing in ambient documentation workflows. We retrospectively analyzed 75,133 paired ambient AI-drafted and clinician-finalized note sections from ambulatory care at a large academic health system. Using a rule-based NLP pipeline, we extracted 21 SDoH categories and quantified retention, deletion, and addition. SDoH appeared in 25.2% of AI drafts versus 17.2% of final notes. At the mention level, AI captured 29,991 SDoH mentions, of which 45.1% were deleted, 54.9% were retained with clinicians adding 3,583 new mentions. Insurance and marital status were most often deleted, whereas substance use and physical activity were more often retained. Deletion patterns also varied by specialty, supporting the need for specialty-aware ambient AI systems.

7
Decision Curve Analysis for Evaluating Machine Learning Models for Next-Day Transfer Out of ICU

Pozo, M.; Pape, A.; Locke, B.; Pettine, W. W.

2026-04-21 health informatics 10.64898/2026.04.19.26351213 medRxiv
Top 0.6%
0.5%

Timely identification of intensive care unit (ICU) patients likely to exit the unit can support anticipatory workflows such as chart review, eligibility screening, and patient outreach prior to transfer. Most ICU discharge prediction studies report discrimination and calibration, but these metrics do not quantify the decision consequences of acting on predictions. Using adult ICU admissions from MIMIC-IV, we represented each ICU stay as a sequence of daily clinical summaries and trained logistic regression, random forest, and XGBoost models to predict next-day ICU transfer. Models achieved ROC AUC of 0.80-0.84 with differing calibration. We evaluated decision utility using decision curve analysis (DCA), where positive predictions trigger proactive review. Across thresholds, model-guided strategies outperformed review-all, review-none, and a simple clinical rule. To translate net benefit into implementable operations, we modeled a clinical trial recruitment workflow with an 8-hour daily time constraint, incorporating chart review and consent effort. At a feasible operating threshold (0.23), the model flagged ~23 charts/day and yielded ~1.23 enrollments/day under conservative eligibility and consent assumptions. These results demonstrate that DCA provides a transparent framework for determining when ICU transfer predictions are worth using and how thresholds should be selected to align with real-world workflow constraints. Data and Code Availability: This research has been conducted using data from MIMIC-IV. Researchers can request access via PhysioNet. Implementation code is available upon request.
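The central quantity in decision curve analysis, as used in the study above, is the net benefit of acting on predictions at a chosen threshold probability. A minimal sketch under the standard DCA definition (illustrative Python, not the preprint's implementation):

```python
import numpy as np

def net_benefit(y_true, y_prob, pt):
    """Net benefit of flagging patients whose predicted probability
    meets threshold pt: NB = TP/N - (FP/N) * pt / (1 - pt)."""
    y_true = np.asarray(y_true)
    pred = np.asarray(y_prob) >= pt
    n = y_true.size
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    return tp / n - (fp / n) * pt / (1 - pt)

def net_benefit_review_all(y_true, pt):
    """The 'review-all' baseline: flag every patient."""
    prev = np.mean(y_true)
    return prev - (1 - prev) * pt / (1 - pt)
```

Plotting both functions over a range of thresholds, alongside the review-none baseline at zero, yields the standard decision curve; the study selects 0.23 as a feasible operating threshold.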

8
Patterns of maternal transport in a state with levels of maternal care and no formal perinatal regions

Li, J.; Steimle, L. N.; Carrel, M.; Byrd, R. A.; Radke, S. M.

2026-04-22 health systems and quality improvement 10.64898/2026.04.20.26351263 medRxiv
Top 0.6%
0.5%

Purpose: To characterize maternal transport patterns in Iowa, a state with levels of maternal care and without formal perinatal regions, and assess whether transport decisions reflect efficient, risk-appropriate coordination. Methods: We analyzed 2010-2023 Iowa birth records, which included 2,251 maternal transports between obstetric facilities across 106 unique routes. We characterized transport patterns and applied a community detection algorithm to identify "communities" of obstetric facilities that disproportionately transport among themselves. Findings: Suburban and rural counties have elevated transport rates compared to urban counties. 2,189 transports (97%) were from lower- to higher-level facilities. Among these, 2,037 (93%) were to Level III tertiary care centers. 567 transports (25.2%) bypassed a closer facility offering an equivalent or higher level of care than the destination facility. Health system affiliation was associated with bypassing transport, indicating potential organizational rather than purely geographic drivers of transport decisions. Three "communities" of obstetric facilities, largely shaped by geographic proximity, were identified. Conclusions: Although Iowa does not have formal perinatal regions, patterns of maternal transport are mostly in line with three de facto regions. Some potential inefficiencies were identified, such as obstetric facilities transporting to a farther facility when a closer facility offered the same level of care or higher. These findings may help identify opportunities to enhance care coordination among obstetric facilities, optimize maternal transport networks, and improve regionalization of maternal care.

9
Trends and epidemiological profile of preventable hospitalizations in Honduras (2014 - 2024): An 11-year analysis of ambulatory care sensitive conditions

Alfaro, H. E.; Lara-Arevalo, J.

2026-04-24 health policy 10.64898/2026.04.22.26351522 medRxiv
Top 0.7%
0.4%

Ambulatory Care Sensitive Conditions (ACSCs) are conditions for which effective and timely primary health care (PHC) can prevent hospitalizations. They are widely used as a proxy indicator of access to and quality of PHC. Despite their relevance, evidence from Central America remains scarce. This study aimed to quantify the burden, describe the epidemiological profile, and assess temporal trends of ACSC hospitalizations in Honduras from 2014 to 2024. We conducted a retrospective observational study using national administrative hospital discharge data from all Ministry of Health hospitals. ACSCs were defined using a standardized list of 20 diagnostic groups based on ICD-10 codes. We estimated percentages and sex-age-standardized hospitalization rates per 10,000 inhabitants. Clinical indicators included length of stay (LOS) and in-hospital fatality rates. Temporal trends were evaluated using joinpoint regression models to estimate annual percent changes (APC). Analyses included stratification by age, sex, and disease category. A total of 4,023,944 hospitalizations were analyzed, of which 547,486 (13.6%) were classified as ACSCs. The overall sex-age-standardized rate was 54.1 per 10,000 inhabitants. ACSC standardized rates increased between 2014 and 2018 (APC: 2.7%; 95% CI: -2.4; 15.2), declined sharply between 2018 and 2021 (APC: -17.8%; 95% CI: -30.6; -10.3), and increased again between 2021 and 2024 (APC: 15.9%; 95% CI: 4.6; 37.6). Despite this rebound, rates remained below pre-pandemic levels. ACSCs were concentrated among children under 5 years (27.7%) and adults aged 60 years and older (29.9%). Noncommunicable diseases accounted for 56.8% of cases, with diabetes mellitus as the leading cause. Compared with non-ACSC hospitalizations, ACSCs were associated with longer LOS (4.9 vs. 3.9 days; p < 0.001) and higher in-hospital fatality rates (2.4% vs. 1.7%; p < 0.001). ACSC hospitalizations constitute a substantial burden in Honduras and reflect persistent gaps in PHC performance. Strengthening PHC resilience and capacity, particularly for chronic disease management and vulnerable populations, is essential to reduce avoidable hospitalizations and improve health system efficiency and equity.

10
Artificial-Intelligence-Enabled Early Malnutrition Risk Assessment Tools for Elderly Trauma Patients in Intensive Care Units

Wei, X.; Xao, X.; Hou, J.; Wang, Q.

2026-04-27 nutrition 10.64898/2026.04.26.26351765 medRxiv
Top 0.7%
0.4%

Background & Aims: Accurate assessment of clinical malnutrition using anthropometric and functional indicators could improve the care of elderly trauma patients in intensive care units (ICUs). This study aimed to develop an AI-driven malnutrition assessment toolbox based on a minimal set of clinically feasible indicators. Methods: Multiple machine learning models, including logistic regression, support vector machines, k-nearest neighbors, decision trees, random forests, XGBoost, and neural-network-based ensemble models, were developed using different indicator configurations from a clinically collected patient dataset. Models were trained using baseline and longitudinal measurements to predict malnutrition risk. SHAP analysis was used to interpret the importance of selected indicators. Results: Baseline (Day 1) data alone did not provide a reliable prediction, whereas longitudinal measurements substantially improved performance. Models based on a minimal indicator set, including bilateral mid-upper arm circumference, calf circumference, and key static variables, outperformed models using the full indicator set. Tree-based methods consistently outperformed linear and distance-based models, with the three-time-point XGBoost achieving the best individual performance. Neural-network-based ensemble models further improved predictive stability. The best overall performance was achieved by the ensemble model using the minimal indicator set from Day 1 and Day 3. SHAP analysis confirmed the importance of the selected indicators. Conclusions: This AI-driven toolbox provides an efficient and clinically feasible approach for early malnutrition assessment in elderly trauma patients in the ICU. Its strong performance with a minimal indicator set supports its potential for integration into clinical workflows and future digital twin systems for intelligent nutritional management.

11
Effect of NHS surgical hubs on elective primary hip-and-knee replacement volume, length of stay and waiting times: national longitudinal difference-in-differences study

Wen, J.; Anteneh, Z.; Castelli, A.; Street, A.; Gutacker, N.; Scantlebury, A.; Glerum-Brooks, K.; Davies, S.; Bloor, K.; Rangan, A.; Castro Avila, A.; Lampard, P.; Adamson, J.; Sivey, P.

2026-04-22 health policy 10.64898/2026.04.21.26351383 medRxiv
Top 0.8%
0.3%

Objectives: To evaluate the effect of surgical hubs on the volume of surgeries, patient waiting times, and length of hospital stay for elective hip and knee replacements in the English NHS. Design: A retrospective longitudinal study using a difference-in-differences approach to compare changes in outcomes at NHS trusts that opened surgical hubs with those that did not. Setting: The study was set in the English NHS, using administrative data from NHS acute trusts providing elective hip and knee replacements between April 2014 and September 2024. Participants: The study included 76 NHS trusts. The treatment group consisted of 29 trusts that opened a surgical hub for trauma and orthopaedic surgery during the study period. The control group consisted of 47 trusts that did not. 48 trusts that performed fewer than 1,000 relevant procedures over the ten-year period or that reported data for fewer than 41 of the 42 quarters in the sample period were excluded. Intervention: The phased introduction of surgical hubs dedicated to elective procedures at 29 NHS trusts between Q1 2020 and Q3 2024. Main outcome measures: The three main outcomes, measured at the trust-quarter level, were the total number of elective primary hip and knee replacements (surgical volume), the average length of stay in hospital, and the average waiting time from being added to the waiting list to hospital admission. Results: The opening of a surgical hub was associated with an increase of 43.75 hip and knee replacement surgeries per quarter (95% CI: 22.22 to 65.28), which represents a 19.1% increase compared to the pre-hub mean. Length of stay was reduced by 0.32 days (95% CI: -0.48 to -0.16), a 7.8% reduction. There was no statistically significant effect on average waiting times (-14.96 days, 95% CI: -33.11 to 3.19). Conclusions: Surgical hubs appear to be effective at increasing the number of hip and knee replacements and reducing the time patients spend in hospital. However, in this study, they did not lead to a statistically significant reduction in waiting times overall.
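The difference-in-differences design used in the study above reduces, in its simplest two-group two-period form, to a contrast of four means. An illustrative Python sketch with hypothetical numbers (the actual study also handles staggered hub openings and trust-level covariates):

```python
import numpy as np

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Canonical 2x2 difference-in-differences:
    (change in treated group) minus (change in control group),
    removing shared time trends under the parallel-trends assumption."""
    return ((np.mean(treat_post) - np.mean(treat_pre))
            - (np.mean(ctrl_post) - np.mean(ctrl_pre)))

# Hypothetical quarterly surgery counts per trust (NOT the study's data):
effect = did_estimate(treat_pre=[200, 240], treat_post=[260, 300],
                      ctrl_pre=[210, 230], ctrl_post=[220, 240])
print(effect)  # 50.0
```

The treated group improves by 60 on average and the control group by 10, so the estimated hub effect is 50 surgeries per quarter in this toy example.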

12
MIMIC-IV-Phenotype-Atlas (MIPA): A Publicly Available Dataset for EHR Phenotyping

Yamga, E.; Goudrar, R.; Despres, P.

2026-04-24 health informatics 10.64898/2026.04.16.26350888 medRxiv
Top 0.8%
0.3%

Introduction: Secondary use of electronic health records (EHRs) often requires transforming raw clinical information into research-grade data. A central step in this process is EHR phenotyping - the identification of patient cohorts defined by specific medical conditions. Although numerous approaches exist, from ICD-based heuristics to supervised learning and large language models (LLMs), the field lacks standardized benchmark datasets, limiting reproducibility and hindering fair comparison across methods. Methods: We developed the MIMIC-IV Phenotype Atlas (MIPA) dataset, an adaptation of MIMIC-IV that provides expert-annotated discharge summaries across 16 phenotypes of varying prevalence and complexity. Two independent clinicians reviewed and labeled the discharge summaries, resolving disagreements by consensus. In parallel, we implemented a processing pipeline that extracts multimodal EHR features and generates training, validation, and testing datasets for supervised phenotyping. To illustrate MIPA's utility, we benchmarked four phenotyping methods on the task: ICD-based classifiers, keyword-driven Term Frequency-Inverse Document Frequency (TF-IDF) classifiers, supervised machine learning (ML) models, and LLMs. Results: The final MIPA corpus consists of 1,388 expert-annotated discharge summaries. Annotation reliability was high (mean document-level kappa = 0.805, mean label-level kappa = 0.771), with 91% of disagreements resolved through consensus review. MIPA provides high-quality phenotype labels paired with structured EHR features and predefined train/validation/test splits for each phenotype. In the benchmarking case study, LLMs achieved the highest F1 scores in 13 of 16 phenotypes, particularly for conditions requiring contextual interpretation of clinical narrative, while supervised ML offered moderate improvements over rule-based baselines. Conclusion: MIPA is the first publicly available benchmark dataset dedicated to EHR phenotyping, combining expert-curated annotations, broad phenotype coverage, and a reproducible processing pipeline. By enabling standardized comparison across ICD-based heuristics, ML models, and LLMs, MIPA provides a durable reference resource to advance methodological development in automated phenotyping.

13
A profile analysis of peripherally inserted central catheters implanted over 10 years in a quaternary hospital

da Luz, C. C.; Sorbello, C. C. J.; Epifanio, E. A.; dos Santos, C. d. A.; Brandi, S.; Guerra, J. C. d. C.; Wolosker, N.

2026-04-23 health systems and quality improvement 10.64898/2026.04.22.26351492 medRxiv
Top 1.0%
0.3%

Background: Vascular access is essential in treating patients undergoing prolonged endovenous therapy such as chemotherapy, antibiotics, and parenteral nutrition. Since the 1990s, when PICCs (peripherally inserted central catheters) appeared, vascular access options have expanded significantly, revolutionizing the treatment landscape for all types of patients. Objective: To analyze and describe the profile of PICC use in a Brazilian quaternary hospital over 10 years, with data collected by the infusion therapy team, evaluating the number of PICCs implanted over the years, patients' epidemiological and clinical characteristics, insertion details, associated complications, and the reason for removal. Methods: A retrospective cohort study employing a quantitative, non-experimental approach to classify and statistically analyze past events associated with 21,652 PICCs implanted from January 2012 to December 2021 in a quaternary hospital in Sao Paulo, Brazil. All the catheters were implanted, and the data collected, by a team of nurses specializing in infusion therapy. We analyzed the number of catheters implanted over the years, insertion characteristics, patients' epidemiological and clinical data, possible associated complications, and the reason for removal. Statistical analyses were conducted using R software (version 4.4.1) and SPSS (version 29) for Windows (IBM Corp, Armonk, NY). Results: During the specified period, 21,652 catheters were analyzed. The patients' gender distribution was nearly balanced (48.2% versus 51.8%), and the average age was 66 years. Cardiovascular and metabolic issues were the most common comorbidities, and between 2020 and 2021, 29.3% of the sample tested positive for COVID-19. The most common location of hospitalization and implantation was the medical-surgical clinic (31.6% - 41.4%), and the most used type of catheter was the Power Picc (83.9%). The estimated complication incidence density is 2.94 complications per 1,000 catheter-days. Almost all the PICCs (98.2%) were adequately located at the cavo-atrial junction after the first attempt, 82.2% of catheters were removed after completion of therapy, and the median duration of catheter use was 12 days. Conclusion: PICCs are widely employed for drug infusion, with their use growing progressively due to specialized teams' greater availability and training. The high efficiency of these devices, with a relatively low risk of complications already observed in previous studies, was reinforced by the findings of this study of more than 20,000 catheters.

14
Post-Diarrheal Nutritional Trajectories Among Malnourished Children: A Clustering and Multinomial Modelling Approach

Ogwel, B.; Awuor, A. O.; Onyando, B. O.; Ochieng, R.; Hossain, M. J.; Conteh, B.; Mujahid, W.; Shaheen, F.; Munthali, V.; Malemia, T.; Tapia, M.; Keita, A. M.; Nasrin, D.; Kosek, M. N.; Qadri, F.; Kotloff, K. L.; Pavlinac, P. B.; McQuade, E. T. R.

2026-04-21 nutrition 10.64898/2026.04.20.26351264 medRxiv
Top 1.0%
0.3%

Although the co-occurrence of diarrhea and malnutrition is well documented, research has largely focused on the acute management of diarrheal illness. Despite its importance, longitudinal evidence characterizing post-diarrheal recovery trajectories is sparse. We sought to characterize post-diarrheal nutritional recovery trajectories among children aged 6-35 months who were malnourished at enrollment using data from the Enterics for Global Health (EFGH) Shigella Surveillance study (2022-2024). EFGH enrolled children aged 6-35 months presenting with medically-attended diarrhea and followed them at 4 weeks and 3 months post-enrollment. This analysis included children with baseline wasting, stunting, or underweight (z-score < -2) and complete anthropometric follow-up. Latent class mixed-effects models were used to identify distinct post-diarrheal growth trajectories based on changes in anthropometric z-scores over time. Multinomial modified Poisson regression models examined associations between baseline factors and trajectory membership. Among 9,480 enrolled children, 16.5% (n=1,561) were wasted, 22.7% (n=2,155) stunted, and 21.0% (n=1,994) underweight at baseline. Wasting showed greater recovery potential (80.8%) compared with stunting (38.5%) and underweight (40.3%). Recovery was shaped by factors across multiple levels. Clinical severity markers ( prolonged diarrhea, dehydration, and hypoxemia) increased the risk of nutritional failure. Age also influenced outcomes: infants were more likely to worsen, whereas older toddlers more often experienced stagnation. Interventions including exclusive breastfeeding, oral rehydration therapy, appropriate antibiotics, and zinc supplementation, improved outcomes, while unimproved sanitation undermined recovery. These findings highlight the need for integrated strategies combining infection control, nutritional rehabilitation, and water, sanitation, and hygiene interventions tailored to the childrens developmental stage. 
Key Messages
- Post-diarrheal nutritional recovery is highly heterogeneous, with wasting showing the greatest potential for improvement, while stunting and underweight often result in persistent growth stagnation.
- Baseline anthropometric deficits alone are insufficient to predict recovery, highlighting the need for dynamic monitoring and individualized management.
- Infants are particularly vulnerable to acute nutritional deterioration, while older toddlers frequently experience growth stagnation.
- Modifiable protective factors, including exclusive breastfeeding, ORS, zinc, and appropriate antibiotics, improved outcomes, whereas poor sanitation undermined recovery.
- Integrated strategies combining clinical care, nutrition, and environmental interventions, tailored to a child's developmental stage, are critical to support sustained child growth and development.
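The trajectory analysis above relied on latent class mixed-effects models, which are beyond a short sketch. As a simplified, rule-based stand-in (thresholds, labels, and the function name are illustrative, not the study's definitions), a child's post-diarrheal trajectory could be labeled from anthropometric z-scores at baseline and follow-up:

```python
def classify_trajectory(z_baseline, z_followup, cutoff=-2.0, delta=0.25):
    """Label a post-diarrheal growth trajectory from z-scores.

    z_baseline: anthropometric z-score at enrollment (assumed < cutoff).
    z_followup: z-score at the last follow-up visit (e.g. 3 months).
    delta: minimal change treated as a real shift (illustrative value).
    """
    if z_followup >= cutoff:
        return "recovered"    # crossed back above the deficit threshold
    if z_followup <= z_baseline - delta:
        return "worsened"     # meaningful further decline
    if z_followup >= z_baseline + delta:
        return "improving"    # gaining, but still below the threshold
    return "stagnated"        # little change over follow-up
```

Unlike a latent class model, which discovers trajectory shapes from the data, this rule fixes the categories in advance; it is only meant to make the recovered/worsened/stagnated distinction concrete.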

15
Most Instability Phases Resolve: Empirical Evidence for Trajectory Plasticity in Multimorbidity Care from Longitudinal Relational Monitoring

Martin, C. M.; Henderson, I.; Campbell, D.; Stockman, K.

2026-04-24 health informatics 10.64898/2026.04.22.26351537 medRxiv
Top 1%
0.3%

Background: The instability-plasticity framework proposes that multimorbidity trajectories periodically enter instability phases that are vulnerable to escalation but also potentially modifiable through relational intervention. Whether such phases commonly resolve without acute care, or predominantly progress to hospitalisation, has not been quantified at scale. Objective: To quantify instability window outcomes across a longitudinal monitoring cohort; to test whether the characteristics distinguishing admitted from resolved windows reflect within-patient trajectory dynamics or between-patient severity; and to characterise which patient-reported and operator-rated signals reliably precede admission, using both a curated pilot sub-cohort and the full monitoring cohort with an explicit cross-cohort comparison. Methods: Two complementary analyses were conducted on data from the MonashWatch Patient Journey Record (PaJR) relational telehealth system. Instability windows were identified algorithmically (>=2 consecutive calls with Total_Alerts >=3) across the full longitudinal dataset (16,383 calls, 244 patients, 2.5 years) and classified by linkage to ED and hospital admission data. Window characteristics were compared at window, patient, and paired within-patient levels. Pre-admission signal cascades were analysed in two configurations: a curated pilot sub-cohort (64 patients, 280 calls, +/-10-day window, 103 admissions, December 2016-September 2017) and the full monitoring cohort (175 patients, 1,180 pre-admission calls, +/-14-day window, December 2016-July 2019). A three-way cross-cohort comparison decomposed differences between the two configurations into pipeline and population effects. Results: 621 instability windows were identified across 157 patients (64% of the monitored cohort). 67.3% resolved without hospital admission or ED attendance, a rate stable across alert thresholds 1-5. 
In paired within-patient analysis (n = 70), duration in days (p = 0.002) and multi-domain breadth (p < 0.001) distinguished admitted from resolved windows; alert intensity did not. In the pilot sub-cohort, patient-reported illness prognosis (Q21) was the dominant pre-admission signal (GEE beta = +0.058, AUC = 0.647, p-BH = 0.018). This finding did not replicate in the full cohort: Q21 was non-significant (GEE beta = -0.008, p = 0.154, AUC = 0.507). Cross-cohort analysis identified selective curation of the pilot sub-cohort as the primary explanation. In the full cohort, six signals escalated significantly before admission after Benjamini-Hochberg correction: total alerts, health impairment (Q26), red alerts, self-rated health (Q3), patient concerns (Q1), and operator concern (Q34). Health impairment achieved the highest individual AUC (0.605) and showed the longest pre-admission lead. No individual signal exceeded AUC 0.61. Conclusions: Two-thirds of instability phases resolve without hospitalisation, providing direct empirical support for trajectory plasticity as a clinically frequent phenomenon. Within the same patient, persistence, both in duration and in the consistency of high-severity multi-domain flagging across calls, distinguishes trajectories that tip into admission from those that resolve. The Q21 signal reversal between cohorts illustrates how selective curation can produce compelling but non-replicable findings in monitoring research. In the full population, objective alert signals and operator judgement, rather than patient illness prognosis, carry the pre-admission signal.
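The window rule stated in Methods (at least two consecutive calls with Total_Alerts >= 3) is straightforward to express in code. A minimal sketch, assuming each patient's calls arrive as an ordered list of per-call alert counts; the function and parameter names are illustrative, not the authors' implementation:

```python
def find_instability_windows(total_alerts, threshold=3, min_len=2):
    """Return (start, end) index pairs of maximal runs of calls
    whose alert count is at or above `threshold`, keeping only
    runs of at least `min_len` consecutive calls."""
    windows, start = [], None
    for i, alerts in enumerate(total_alerts):
        if alerts >= threshold:
            if start is None:
                start = i          # a run of elevated-alert calls begins
        else:
            if start is not None and i - start >= min_len:
                windows.append((start, i - 1))
            start = None           # run (if any) ends below threshold
    # close a run that extends to the final call
    if start is not None and len(total_alerts) - start >= min_len:
        windows.append((start, len(total_alerts) - 1))
    return windows
```

Each returned window could then be classified as "admitted" or "resolved" by linkage to ED and hospital admission records, as the paper describes.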

16
A Systematic Exploration of LLM Behavior for EHR phenotyping

Yamga, E.; Murphy, S.; Despres, P.

2026-04-24 health informatics 10.64898/2026.04.16.26350890 medRxiv
Top 1%
0.3%

Background Electronic health record (EHR) phenotyping underpins observational research, cohort discovery, and clinical trial screening. Large language models (LLMs) offer new capabilities for extracting phenotypes from unstructured text, but their performance depends on pipeline design choices, including prompting, text segmentation, and aggregation. No systematic framework has previously examined how these parameters shape accuracy and reproducibility. Methods We evaluated LLM-based phenotyping pipelines using 1,388 discharge summaries across 16 clinical phenotypes. A full factorial experiment with LLaMA-3B, 8B, and 70B systematically varied three pipeline components: prompting (zero-shot, few-shot, chain-of-thought, extract-then-phenotype), chunking (none, naive, document-based), and aggregation (any-positive, two-vote, majority), yielding 24 configurations per model. To compare intrinsic model capabilities, biomedical domain-adapted, commercial frontier (LLaMA-405B, GPT-4o, Gemini Flash 2.0), and reasoning-optimized models (DeepSeek-R1) were evaluated under a fixed configuration. Performance was assessed using precision, recall, and macro-F1; secondary analyses examined prediction consistency (Shannon entropy) and self-confidence calibration, and developed a taxonomy of recurrent model errors. Results Factorial ANOVAs showed that chunking and aggregation were the dominant drivers of performance, whereas the prompting strategy contributed minimally. Configuration effects were stable across model sizes, with no significant Model x Parameter interactions. Phenotype difficulty varied substantially (macro-F1 = 0.40-0.90), yet the highest-performing configuration (whole-document inference without aggregation) was consistent across phenotypes, as confirmed by mixed-effects modeling. In cross-model comparisons, DeepSeek-R1 achieved the highest macro-F1 (0.89), while LLaMA-70B matched GPT-4o and LLaMA-405B at substantially lower cost.
Prediction entropy was low overall and driven primarily by phenotype difficulty rather than prompting or temperature. Self-confidence calibration was only moderately informative: high-confidence predictions were more accurate, but larger models exhibited systematic overconfidence. Conclusions LLM performance in EHR phenotyping is governed primarily by input structure and model capacity, not prompt engineering. Simple, document-level inference yields robust performance across diverse phenotypes, providing practical design guidance for LLM-based cohort identification while underscoring the continued need for human oversight for challenging phenotypes.
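The consistency measure named above, Shannon entropy over repeated predictions for the same document, follows the standard formula; a stdlib-only sketch (the function name is illustrative), where 0 bits means every repeated run returned the same label:

```python
import math
from collections import Counter

def prediction_entropy(labels):
    """Shannon entropy (in bits) of a list of repeated predictions
    for one document: -sum(p * log2 p) over observed label frequencies."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

For example, five identical labels give 0 bits, while an even split between two labels gives the maximum of 1 bit, matching the paper's framing of low entropy as high run-to-run consistency.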

17
The Acceptability and Impact of the Community-Based Blood Pressure Group pilot intervention in Zimbabwe.

Mhino, F. M.; Ndanga, A.; Chivandire, T.; Sekanevana, C.; Mpandaguta, C. E.; Mwanza, T.; Mutengerere, A.; Scott, S.; Chimberengwa, P.; Dixon, J.; Ndhlovu, C. E.; Seeley, J.; Chingono, R. M. S.; Sabapathy, K.

2026-04-22 public and global health 10.64898/2026.04.20.26351307 medRxiv
Top 1%
0.3%

Introduction: Over one billion people worldwide have hypertension. In Zimbabwe, prevalence is an estimated 38%, surpassing the global average of 34%, and >50% of people with hypertension are undiagnosed. The Community blood pressure (BP) groups (Com-BP) study examined whether community groups of people living with hypertension, provided with BP machines and led by trained facilitators, could improve awareness, screening and support for those diagnosed with hypertension, to help BP control. We present findings from the quantitative evaluation of the Com-BP pilot intervention. Methods: The acceptability of the Com-BP intervention, and its potential effectiveness in improving knowledge, attitudes and practices (KAP) and in reducing BP among hypertensive adults in Zimbabwe, was evaluated. Cross-sectional surveys using standardised questionnaires, and BP and body mass index (BMI) assessments, were done at the start and end of the pilot intervention. Statistical evidence of differences between baseline and follow-up was examined using the Wilcoxon signed-rank test for continuous data and McNemar's test for categorical data. Results: Fourteen groups (seven urban and seven rural) were formed and 151 participants joined over a median of 5 months. Retention in the groups was 97.9% (137/140 recruited at baseline), with approximately equal numbers from the urban and rural sites. Median age at baseline was 54 years (IQR 45-66y; range 30-92y) and the majority (79%, n=108) were female. Most participants (82.5%, n=113) rated their experience of the group sessions as excellent. The proportions of participants with changes in KAP from baseline to endline were as follows: 45.3% (n=62) to 81.0% (n=111) (p=0.004) able to identify at least two predisposing factors for hypertension; 65.0% (n=89) to 77.4% (n=106) (p=0.02) reporting ≥1 day of vigorous physical activity/week; 28.5% (n=39) to 13.9% (n=19) (p=0.001) reporting salt added to meals at the table.
There was no statistical evidence of any difference in medication adherence (p=0.06). The proportion of participants with uncontrolled hypertension was 58.1% (n=79) at baseline and fell to 31.8% (n=43) at follow-up (p<0.001). Discussion: Community groups for improving awareness, detection and support are acceptable and led to improvements in self-reported KAP and in the prevalence of uncontrolled BP. Further research on the sustainability and impact of the intervention is required.
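The paired baseline-vs-endline comparisons of categorical outcomes above used McNemar's test. A stdlib-only sketch of the exact (binomial) two-sided form, not the authors' code: the test depends only on the two discordant-pair counts, b (improved) and c (worsened), under a null of p = 0.5:

```python
import math

def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from discordant-pair counts.

    b: pairs that changed in one direction (e.g. controlled at endline only).
    c: pairs that changed in the other direction.
    Concordant pairs carry no information and are not needed.
    """
    n = b + c
    if n == 0:
        return 1.0                 # no discordant pairs: nothing to test
    k = min(b, c)
    # one-sided binomial tail P(X <= k) with X ~ Binomial(n, 0.5), doubled
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With few discordant pairs the exact form is preferable to the chi-square approximation, which is one plausible reason borderline results such as the adherence comparison (p=0.06) can hinge on the test variant used.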

18
Large language models and retrieval augmented generation for complex clinical codelists: evaluating performance and assessing failure modes

Matthewman, J.; Denaxas, S.; Langan, S.; Painter, J. L.; Bate, A.

2026-04-24 health informatics 10.64898/2026.04.23.26351098 medRxiv
Top 1%
0.3%

Objectives: Large language models (LLMs) have shown promise in creating clinical codelists for research purposes, a time-consuming task requiring expert domain knowledge. Here, we evaluate the performance and assess failure modes of a retrieval augmented generation (RAG) approach to creating clinical codelists for the large and complex medical terminology used by the Clinical Practice Research Datalink (CPRD). Materials & Methods: We set up a RAG system using a database of word embeddings of the medical terminology that we created using a general-purpose word embedding model (gemini-embedding). We developed 7 reference codelists presenting different challenges and tagged required and optional codes. We ran 168 evaluations (7 codelists, 2 different database subsets, 4 models, 3 epochs each). Scoring was based on the omission of required codes and the inclusion of irrelevant codes. We used model-grading (i.e., grading by another LLM with the reference codelists provided as context) to evaluate the output codelists (a score of 0% being all incorrect and 100% being all correct). Results: Accuracy varied across models and codelists, with Gemini 3 Pro (score 43%) generally performing better than Claude Sonnet 4.6 (36%) and Gemini 3 Flash, and OpenAI GPT 5.2 performing worst (14%). Models performed better with shorter target codelists (e.g., Eosinophilic esophagitis with four codes, and Hidradenitis suppurativa with 14 codes). Conversely, all models consistently failed to produce a complete Wrist fracture codelist (with 214 required codes). We further present evaluation summaries and failure mode evaluations produced by parsing LLM chat logs. Discussion: Besides demonstrating that a single-shot RAG approach is currently not suitable for codelist generation, we demonstrate failure modes including hallucinations, retrieval failures, and generation failures where retrieved codes are not used.
Conclusions: Our findings suggest that while RAG systems using current frontier LLMs may create correct clinical codelists in some cases, they still struggle with large, complex terminologies and codelists containing many codes. The failure modes we highlight can inform the design of future workflows that avoid these failures.
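The retrieval step of such a RAG pipeline, ranking terminology entries by cosine similarity between a query embedding and precomputed term embeddings before passing the top candidates to the LLM, can be sketched as follows. The toy two-dimensional vectors and function names are illustrative only (the paper used gemini-embedding over the CPRD terminology):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_top_k(query_vec, term_vecs, k=3):
    """term_vecs: dict mapping terminology entry -> embedding vector.
    Returns the k entries most similar to the query embedding."""
    ranked = sorted(term_vecs,
                    key=lambda t: cosine(query_vec, term_vecs[t]),
                    reverse=True)
    return ranked[:k]
```

One failure mode the paper reports, retrieval failure, corresponds here to a required code simply never appearing in the top-k list, so no amount of downstream generation can recover it.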

19
Meal Timing Patterns and Associations with Fat Mass in Adolescents

Decker, J. E.; Morales, K. H.; Chen, P.-W.; Master, L.; Kwon, M.; Jansen, E. C.; Zemel, B. S.; Mitchell, J. A.

2026-04-23 nutrition 10.64898/2026.04.22.26351498 medRxiv
Top 1%
0.2%

Background: The timing of energy intake could be important in the development of obesity. However, most observational evidence stems from adults, anthropometrically defined obesity outcomes, single meal-timing phenotypes, and traditional regression modeling. Objective: We aimed to describe meal timing patterns in adolescents and determine whether they were associated with fat mass by modeling the median and all other percentiles of the frequency distribution. Methods: We analyzed data from the Sleep and Growth Study 2 (S-Grow2, N=286, 12-13y). Participants completed 3-day 24-hour dietary recalls, and time-stamped eating occasions were used to define 8 meal timing traits, with aid from self-reported wake and bed timing. Principal component analysis (PCA) identified multi-dimensional meal timing patterns. Fat mass index (FMI) was estimated using dual-energy X-ray absorptiometry. Quantile regression assessed associations between meal timing traits and FMI across the entire FMI frequency distribution. Results: The typical first and last eating occasions were 8:00am (40 minutes after waking) and 8:00pm (2.7 hours before sleep), respectively, thus the eating period typically lasted 11.5 hours per day. The typical eating period midpoint was 2:15pm, and the timing when 50% of energy intake was consumed typically occurred at 3:15pm. PCA revealed three meal timing patterns: 1) Delayed Start, Condensed Eating Period (43% of variance; shorter eating period and delayed timing of first eating); 2) Late, Sleep Proximal Eating (30% of variance; later timing of last eating and extended eating period), and 3) Later Energy Intake (10% of variance; delayed energy intake midpoint). Higher scores for the Delayed Start, Condensed Eating Period pattern were associated with higher body mass index and FMI at the upper tails of their distributions.
Conclusions: Distinct multidimensional meal timing patterns emerged in early adolescence, with the delayed start, condensed eating period pattern potentially associated with higher adiposity.
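Several of the timing traits described above (first and last eating occasions, eating period duration, eating midpoint) follow directly from time-stamped eating occasions. A minimal sketch using clock times in hours since midnight; the trait names are illustrative and do not reproduce the study's exact definitions of all 8 traits (energy-weighted traits such as the 50%-of-intake time would also need per-occasion calories):

```python
def meal_timing_traits(eating_times_h):
    """Derive simple timing traits from one day's eating occasions.

    eating_times_h: list of eating-occasion clock times in hours
    since midnight (e.g. 8.0 for 8:00am, 20.0 for 8:00pm).
    """
    first, last = min(eating_times_h), max(eating_times_h)
    return {
        "first_eating": first,            # time of first eating occasion
        "last_eating": last,              # time of last eating occasion
        "eating_period_h": last - first,  # daily eating window length
        "eating_midpoint": (first + last) / 2,
    }
```

Traits like these, averaged over the 3 recall days, would then feed the PCA that produced the three patterns reported above.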

20
Drivers and barriers to the implementation of the school feeding values-based food procurement guidelines and ultra-processed food restrictions

Fernandes Davies, V.; Perrut, I.; Thow, A.-M.; Duran, A. C.

2026-04-24 health policy 10.64898/2026.04.22.26351508 medRxiv
Top 2%
0.2%

Objective: To investigate the local-level drivers of and barriers to the implementation of four National School Feeding Program (PNAE) guidelines: the banning of sugary drinks; restrictions on the procurement of processed and ultra-processed foods; the mandatory increase in weekly servings of fruits and vegetables offered to students; and mandatory direct procurement from family farmers. Design: Qualitative study using semi-structured interviews. Street-level bureaucracy theory informed the theoretical framework and thematic analysis. Setting: Brazilian municipalities across the country's five geographic regions (North, Northeast, Southeast, South, and Midwest). Participants: Stakeholders (e.g. nutritionists, school cooks, and food procurement managers) involved in the local implementation of the PNAE program across the country. Results: Ninety stakeholders were interviewed. Stakeholders reported autonomy to perform their activities, collaboration and support from other members of the local government and from food providers, adequate infrastructure such as well-equipped kitchens, the availability of trained personnel, and political commitment as drivers of optimum program implementation. Reported barriers included lack of support and resistance to change among cooks, teachers and parents; insufficient physical and human resources; and limited political commitment. When barriers outweighed drivers, interviewees reported adapting their practices, often in restrictive ways that could compromise the implementation of the program. Conclusions: Drivers and barriers to local PNAE implementation were generally similar across the studied municipalities, although their magnitude varied. In contexts of greater economic vulnerability and fiscal constraint, additional support and targeted actions from the federal government may be required to strengthen local implementation.